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Abstract 


The  present  paper  deals  with  the  problem  of  computing  a  few  of  the  eigenvalues  with  largest  (or 
smallest )  rtal  parts,  of  a  large  sparse  aonsymmetric  matrix.  We  present  a  general  acceleration 
technique  based  on  Chebyshev  polynomials  and  discuss  its  practical  application  to  Arnoldi’s  method 
and  the  snbspace  iteration  method.  The  resulting  algorithms  are  compared  with  the  classical  ones  in 
a  few  experiments  which  exhibit  a  sharp  superiority  of  the  Arnoldi*ChebysheT  approach. 
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I.  Introduction 

An  important  number  of  applications  in  applied  sciences  and  engineering  require  the  numerical 
solution  of  a  large  nonsymmetric  matrix  eigenvalue  problem.  Such  is  the  case  for  example,  in 
economical  modeling  [5,  Id]  where  the  stability  of  a  model  is  interpreted  in  terms  of  the  dominant 
eigenvalues  of  a  large  nonsymmetric  matrix  A.  In  Markov  chain  modeling  of  queueing 
networks  (17,  18,  34],  one  is  interested  in  an  eigenvector  associated  with  the  eigenvalue  unity  of  the 
transpose  of  a  large  nonsymmetric  stochastic  matrix.  In  structural  engineering  [6,  9,  10,  32],  and  in 
fluid  mechanics  [33]  one  often  seeks  to  solve  a  bifurcation  problem  whereby  a  few  of  the  eigenvalues 
of  a  family  of  nonsymmetric  matrices  A(a)  are  computed  for  several  values  of  the  parameter  a  in 
order  to  determine  a  critical  value  ae  such  that  some  particular  eigenvalue  changes  sign,  or  crosses 
the  imaginary  axis.  When  it  is  a  pair  of  complex  eigenvalues  that  crosses  the  imaginary  axis,  the 
bifurcation  point  ac  is  referred  to  as  a  Hopf  bifurcation  point.  This  important  problem  was  recently 
examined  by  Jepson  (13]  who  proposes  several  techniques  most  of  which  deal  with  small  dimension 
cases.  Commun  bifurcation  problems  can  be  solved  by  computing  a  few  of  the  eigenvalues  with 
largest  real  parts  of  A(a)  and  then  detect  when  one  of  them  changes  sign.  The  study  of  stability  of 
electrical  networks  is  yet  another  interesting  example  requiring  the  numerical  computation  of  the 
eigenvalue  of  largest  real  part.  Finally,  we  can  mention  the  occurence  of  nonsymmetric  generalised 
eigenvalue  problems  when  solving  the  Riccati  equations  by  the  Schur  techniques  [20]. 

As  suggested  by  the  above  important  applications,  we  will  primarily  be  concerned  with  the 
problem  of  computing  a  few  of  the  eigenvalues  with  algebraically  largest  real  parts,  and  their 
associated  eigenvectors,  of  a  large  nonsymmetric  matrix  A.  The  literature  in  this  area  has  been 
relatively  limited  as  compared  with  that  of  the  more  common  symmetric  eigenvalue  problem.  The 
subspace  iteration  method  (2,  4,  13,  30,  37],  seems  to  have  been  the  preferred  algorithm  for  many 
years,  and  is  still  often  recommended  [12].  However,  this  algorithm  computes  the  eigenvalues  of 
largest  modulus  while  the  above  mentioned  applications  require  those  of  largest  (or  smallest) 
algebraically  real  parts. 

Furthermore,  it  is  well-known  in  the  symmetric  case  that  the  Lanesos  algorithm  is  far  superior  to 
the  subspace  iteration  method  [24].  Some  numerical  experiments  described  in  [30]  indicate  that 
Krylov  subspace  based  methods  can  be  more  effective  in  the  nonsymmetric  case  as  well.  There  are 
two  known  algorithms  that  generalize  the  symmetric  Lanezos  method  to  nonsymmetric  matrices: 
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1.  The  Lancsos  biorthogonalization  algorithm  [19]; 

2.  Arnoldi’s  method  [1]. 

The  first  method  has  been  recently  ressucitated  by  Pariett  and  Taylor  (26,  38]  who  propose  an 
interesting  way  of  avoiding  break-downs  from  whieh  the  method  may  otherwise  suffer.  The  second 
was  examined  in  [30]  where  several  alternatives  have  been  suggested  and  in  [31]  where  some 
additional  theory  was  proposed.  With  the  appropriate  initial  vectors,  both  of  these  approaches 
reduce  to  the  symmetric  Lanezos  algorithm  when  the  matrix  A  is  symmetric. 

In  the  present  paper  we  will  describe  a  hybrid  method  based  on  the  Chebyshev  iteration  algorithm 
and  Arnoldi's  method.  These  two  methods  taken  alone  face  a  number  of  limitations  but,  as  will  be 
seen,  when  combined  they  take  full  advantage  of  each  other’s  attractive  features. 

The  principle  of  Arnoldi’s  method  is  the  following:  start  with  an  initial  vector  v,  and  at  every  step 
compute  AVj  and  orthogonalize  it  against  all  previous  vectors  to  obtain  vj+,.  At  step  m,  this  will 
build  a  basis  {Vj}i— ,  B  of  the  Krylov  subspace  Km  spanned  by  vp  AVj,..<Am*lv1.  The  restriction  of  A 
to  Km  is  then  represented  in  the  basis  (v.)  by  a  Heseenberg  matrix  whose  elements  are  the 
coefficients  used  in  the  orthoyonalitaUoa  process.  The  eigenvalues  of  this  Hessenberg  matrix  will 
provide  approximations  to  the  eigenvalues  of  A.  Clearly,  this  simple  procedure  has  the  serious 
drawback  of  requiring  the  presence  in  memory  of  all  previous  vectors  at  a  given  step  m.  Also  the 
amount  of  work  increases  drastically  with  the  step  number  m.  Several  variations  on  this  basic  scheme 
have  been  suggi-sted  in  [30]  to  overcome  this  difficulty,  the  most  obvious  of  which  b  to  use  the 
method  iteratively,  i.e.  to  restart  the  process  after  every  m  steps.  This  alternative  was  shown  to  be 
quite  effective  when  the  number  of  wanted  eigenvalues  b  very  small,  and  outperformed  the  subspace 
iteration  by  a  wide  margin  in  an  application  related  to  the  Markov  chain  modeling  of  queueing 
networks  [30]. 

There  are  instances,  however,  where  the  iterative  Arnold!  algorithm  exhibits  a  poor  performance. 
In  some  cases  the  minimum  number  of  steps  m  that  must  be  performed  in  each  inner  iteration  in 
order  to  ensure  convergence  of  the  process,  b  too  large.  Another  typical  ease  of  poor  performance  b 
when  the  eigenvalues  that  are  to  be  computed  are  clustered  while  the  unwanted  ones  have  t  very 
favorable  separation  as  b  illustrated  in  the  next  figure,  for  example: 
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Then,  it  is  observed  that  the  iterative  process  will  have  some  diffienlties  to  extract  the  wanted 
eigenvalues  because  the  process  will  tend  to  be  highly  dominated  by  the  convergence  in  the 
eigenvectors  associated  with  the  right  part  of  the  spectrum.  These  difficulties  may  be  overcome  by 
taking  a  large  enough  m  but  this  can  become  expensive  and  impractical. 

In  order  to  avoid  these  shortcomings  of  the  iterative  Arnoidi  process,  and,  more  generally,  to 
improve  its  overall  performance  we  propose  to  use  it  in  conjunction  with  the  Chebyshev  iteration. 
The  main  part  of  this  hybrid  algorithm  is  a  Chebyshev  iteration  which  computes  a  vector  of  the  form 
t;sPj(A)z0,  where  p;  is  a  polynomial  of  degree  i,  and  zQ  is  an  initial  vector.  The  polynomial  p;  is 
chosen  so  as  to  highly  amplify  the  components  of  z0  in  the  direction  of  the  desired  eigenvectors  while 
damping  those  in  the  remaining  eigenvectors.  A  suitable  such  polynomial  can  be  expressed  in  terms 
of  a  Chebyshev  polynomial  of  degree  i  of  the  first  kind.  Once  z^p^A)^  is  computed,  a  few  steps  of 
Arnoldi's  method,  starting  with  are  carried  out  iu  order  to  extract  from  *.  the  desired 

eigenvalues. 

We  will  also  discuss  the  implementation  of  a  Chebyshev  accelerated  subspace  iteration  algorithm 
following  ideas  developed  in  [31]. 

In  the  context  of  large  nonsymmetric  linear  systems,  extensive  work  has  been  devoted  to  the  use  of 
Chebyshev  polynomials  for  accelerating  linear  iterative  methods  [11,  21,  22,  23,  41].  Manteuffel’s 
work  on  the  determination  of  the  optimal  ellipse  containing  the  convex  hull  of  the  spectrum  of  A 
[23],  has  been  decisive  in  making  the  method  reliable  and  effective.  For  eigenvalue  problems, 
Rutiahauser  has  suggested  the  use  of  Chebyshev  polynomials  for  accelerating  the  subspace  iteration 
in  the  symmetric  case  [29,  40].  However,  Chebyshev  acceleration  has  received  little  attention  as  a  tool 
for  accelerating  the  nonsymmetric  eigenvalue  algorithms.  The  algorithms  that  we  propose  in  this 
paper  can  be  regarded  as  a  simple  adaptation  of  Manteuffel’s  algorithm  the  nonsymmetric  eigenvalue 
problem. 

We  point  out  that  a  hybrid  Aruoldi-Chebyshev  method  for  solving  nonsymmetric  linear  systems 


using  ideas  similar  to  the  ones  developed  here  is  currently  being  developed  [8]. 

In  Section  2,  we  will  describe  the  basie  Chebyshev  iteration  for  computing  an  eigenpair  and  analyse 
its  convergence  properties.  In  Sections  3,  4  and  5  we  will  show  how  to  combine  the  Chebyshev 
iteration  with  Arnoldi's  method  and  with  the  subspaee  iteration  method.  In  Section  6  we  will  report 
a  few  numerical  experiments,  and  in  the  last  section  we  will  draw  a  tentative  conclusion. 

2.  Chebyshev  iteration  for  computing  eigenvalues  of  nonsymmetric  matrices 

2.1.  The  basic  iteration 

Let  A  be  a  nonsymmetric  real  matrix  of  dimension  N  and  consider  the  eigenvalue  problem: 

Au*Xu  (1) 

Let  Xj . XN  be  the  eigenvalues  of  A  labelled  in  decreasing  order  of  their  real  parts  and  suppose 

that  we  are  interested  in  which,  to  start  with,  is  assumed  to  be  real. 

Consider  a  polynomial  iteration  of  the  form:  aB—pn(A)t0,  where  is  some  initial  vector  and 
where  pn  is  a  polynomial  of  degree  n.  We  would  like  to  choose  pB  in  such  a  way  that  the  vector  sB 
converges  rapidly  towards  an  eigenvector  of  A  associated  with  Xj  as  n  tends  to  infinity.  Assuming 
for  simplicity  that  A  is  diagonalizabie,  let  us  expand  zQ  and  zB»-pn(A)z0  in  the  eigenbasis  {u;}: 


■„  -  j-/i  tjsi »,  -  *i  ”,  +  |*i  p/v  »i  p) 

Expansion  (3)  shows  that  if  zB  is  to  be  a  good  approximation  of  the  eigenvector  Uj,  then  every 
pB(Xj),  with  jy*l,  must  be  small  in  comparison  with  pB(X{) .  This  leads  us  to  seek  for  a  polynomial 
which  is  small  in  the  discrete  set  R«»{X2,Xj,...^n}  and  which  satisfies  the  normalisation  condition 

Po(\)  ™  1*  H) 

An  ideal  such  polynomial  would  be  the  one  which  minimises  the  (discrete)  uniform  norm  on  the 
discrete  set  R  over  all  polynomials  of  degree  n  satisfying  (4).  However,  this  polynomial  is  clearly 


impossibe  to  compute  without  the  knowledge  of  all  eigenvalues  of  A  and  this  approach  is  therefore  of 
li.ile  interest.  A  simple  and  more  reasonable  alternative,  known  for  a  long  time  [39],  is  to  replace  the 
discrete  minimax  polynomial  by  the  continuous  one  on  a  domain  containing  R  but  excluding  Xj.  Let 
E  be  such  a  domain  in  the  complex  plane,  and  let  Pn  denote  the  space  of  all  polynomials  of  degree 
not  exceeding  n.  We  are  thus  seeking  for  a  polynomial  pn  which  achieves  the  minimum 

min  max  |p(X)|  (5) 

per,.  **,)*»  *€E 

For  an  arbitrary  domain  E,  it  is  usually  difficult  to  explicitly  solve  the  above  minimax  problem. 
Iterative  methods  can  be  used,  however,  and  the  exploitation  of  the  resulting  minimax  polynomials 
for  solving  eigenvalue  problems  constitutes  a  promising  research  area.  An  alternative  around  that 
difficulty  is  to  restrict  E  to  be  an  ellipse  having  its  center  on  the  real  line,  and  containing  the 
unwanted  eigengalues  Xj,i— 2,..N. 

Let  E(d,c.a)  be  an  ellipse  of  center  d,  with  d  real,  focii  d+c,  d-c,  and  major  semi  axis  a,  and 
containing  the  set  R* {X„,..XN}.  Since  the  spectrum  of  A  is  symmetric  with  respect  to  the  real  axis, 
we  will  restrict  E(d,c,a)  to  being  symmetric  as  well.  In  other  words,  the  main  axis  of  the  ellipse  must 
be  either  the  real  axis  or  must  be  parallel  to  the  imaginary  axis.  Therefore,  a  and  c  are  either  real  or 
purely  imaginary,  see  Figure  2-1. 


Then  it  is  known  that,  when  E  is  the  ellipse  E(d,c,a)  in  (a),  the  best  minimax  polynomial  is  the 
polynomial 


Pn(X) 


T„[(X-d)/c)] 

Tn[(X,-d)/c)l 


(«) 


where  Tn  is  the  Chebyshev  polynomial  of  degree  n  of  the  first  kind,  see  [3,  21,  41], 


The  computation  of  zn,  n«l,2..  is  simplified  by  the  three  term  recursion  of  the  Chebyshev 
polynomials: 


T,(X)-X  ;  T0(X)-1. 

Tn+l(X)-2XTn(X)-Tn.l(X),  n— 1,2,.. 


liTTS  r-:'. 
D'r*  t  ;• 

0(9  It 


o 


Letting  f0“T  [(X.-d)/cj,  n—»0,l,..  we  obtain 
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Figure  8*1:  Ellipses  containing  the  set  R  of 
the  remaining  eigenvalues 
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Let  us  transform  this  further  by  setting 
Pn+i(M  -2(tb+1^  Pa(X)-^n+lPlhl(X) 

A  straightforward  manipulation  using  the  definitions  of  i-  and  the  three  term  recurrence 
relation  of  the  Chebyshev  polynomials  shows  that  er.r  i  can  be  obtained  from  the  recursion: 


<r,—  c/(Aj— d) 


'»+!“ 


2/*l  -  <r, 


»  n«l,2,.. 


The  above  two  recursions  can  now  be  assembled  together  to  yield  the  following  basic  algorithm  for 
computing  t;=Pj(A)z0,  i—1,2,..  : 


Algorithm:  Chebyshev  Iteration 


1.  Start:  Choose  an  arbitrary  initial  vector  z0;  Compute 

—  «  /  (Xt  -  d) 

— ‘(A-dI)i0 
c 

2.  Iterate:  for  n»1.2,...  until  convergence  do: 

1 


n+1 


2/^i  - 


•1,2,. 


*n+l  “  2 


n+1 


(A  -  d  I)  in  -  on<rn+1  a^j 


(7) 

(8) 

(9) 


(10) 


Note  that  the  above  algorithm  uses  the  eigenvalue  X1  which  is  unknown.  Recall,  however,  that  in 
(0),  Xj  appears  in  the  denominator  which  is  used  for  sealing  purposes  only.  We  can  therefore  choose 
to  scale  the  polynomial  (ft)  by  Ta((i*-d)/c]  where  v  is  some  approximation  of  Xj.  This  leads  to 
replacing  Xj  by  v  in  (7). 

Another  important  detail,  which  we  have  omitted  to  discuss  for  the  sake  of  clarity,  concerns  the 


case  when  c  is  purely  imaginary.  It  can  be  shown  quite  easily  that  even  in  this  situation  the  above 
recursion  can  still  be  carried  out  in  real  arithmetic.  The  reason  for  this  is  that  the  scalars  i— 1,... 
become  all  purely  imaginary  as  can  easily  be  shown  by  induction.  Hence  the  scalars  <rn+1/c  and 
*0+1*0  *n  *b°Te  algorithm  are  real  numbers.  Thus,  the  primary  reason  for  sealing  by 
Tn[(X1-d)/c]  in  (6)  is  to  avoid  overflow  but,  as  was  just  explained,  a  secondary  reason  is  to  avoid 
complex  arithmetic  when  c  is  purely  imaginary. 


2.2.  Convergence  properties 

In  order  to  understand  the  convergence  properties  of  the  sequence  of  approximations  sq  consider  its 
expansion  (3): 

N  N 


S'i  PA>  «i 


We  would  like  to  examine  the  behavior  of  each  coefficient  of  u;,  for  iy*l.  We  have: 
T[(X- — d)/c)J 

p  (x.)  -  a?..i . -■■■■■- 

"  '  Tn[(X1-d)/c)J 


From  one  of  the  various  ways  of  defining  the  Chebyshev  polynomials  in  the  complex  plane  [27],  the 
above  expression  can  be  rewritten  as 


pA> 


w?+  wr“ 

i  i 

w?  +  wTn 


(ID 


where  w;  represents  the  root  of  largest  modulus  of  the  equation  in  w: 

^w+w"1)— (\-d)/c.  (12) 

From  (11),  pn(Xj)  is  equivalent  to  [wj/wj]",  hence  the  definition: 

Definition  1:  We  will  refer  to  k.m|w./Wj|  as  the  convergence  ratio  of  relative  to  the 
parameters  d,c. 

This  definition  must  be  understood  in  the  following  sense:  each  coefficient  in  the  eigenvector  u.  of 
the  expansion  (3)  behaves  like  «?,  as  n  tends  to  infinity. 


One  of  the  most  important  features  in  Chebyshev  iteration  lies  in  equation  (12).  Observe  that 
there  are  infinitely  many  points  in  the  complex  plane  which  have  the  same  convergence  ratio.  Thus  a 
great  deal  of  simplification  ean  be  achieved  by  locating  those  points  that  are  real  as  it  is  preferable 


to  deal  with  real  quantities  than  imaginary  ones  in  the  above  expression  defining  The  well  known 
mapping  J(w)— ^w+w-*),  often  referred  to  as  the  Jookowski  transform  [27],  maps  a  circle  into  an 
ellipse  in  the  complex  plane.  More  precisely,  for  w—pe^,  J(w)  belongs  to  an  ellipse  of  center  the 
origin,  focal  distance  1,  and  major  semi  axis  a-J(p+p'  *).  Given  the  major  semi  axis  o,  p  is 
determined  by  p—j{a  +  (a  S-l)1/2].  As  a  consequence  the  convergence  ratio  k.  is  simply  pj/p,  where 
+  (o^-l)1/2]  and  a.  is  the  major  semi  axis  of  the  ellipse  centered  at  the  origin,  with  focal 
distance  one  and  passing  through  (Xj-d)/c.  Since  a^o.,  i-«2,3..N,  it  is  easy  to  see  that  pj>p.,i>l, 
and  hence  that  the  process  will  converge.  Note  that  there  is  a  further  mapping  between  Xj  and 
(Aj — d)/c  which  transforms  the  ellipse  E(d,c,ap  into  the  ellipse  E(0,l,ftj)  where  and  ctj  are  related 
by  3j/c.  Therefore,  the  above  expression  for  the  convergence  ratio  can  be  rewritten  as: 

“  ,J>'  ~  »,  +  (»,s-l)>/2  (13) 

where  a-  is  the  major  semi  axis  of  the  ellipse  of  center  d,  focal  distance  c,  passing  through  X..  The 
ratio  k(X)  can  obviously  be  also  defined  for  any  value  X  of  the  complex  plane  by  replacing  X;  by  X. 
From  the  expansion(3),  the  vector  z0  converges  to  fa,  like  K-~n  where  Kj  is  the  largest  of  the 
convergence  ratios  i«—2..N. 

The  ah' — *  algorithm  ignores  the  following  important  points: 

•  It  is  unrealistic  to  assume  that  the  parameters  d  and  c  are  known  beforehand  and  some 
adaptive  scheme  must  be  implemented  in  order  to  estimate  them  dynamically. 

•  The  algorithm  does  not  handle  the  computation  of  more  than  one  eigenvalue.  In 
particular  what  to  do  in  case  X,  is  complex,  i.e.  when  X,  and  Xg— X,  forms  a  complex 
pair? 

Suppose  that  E(d,c,a)  contains  all  the  eigenvalues  of  A  except  for  a  few.  Looking  closely  at  the 
expansion  of  *n,  we  observe  that  it  contains  more  than  just  an  approximation  to  u,  because  we  can 
write: 


z_  —  +  9.  uL  +...  9.  u.  +  < 

n  ii  ij  ij  «r  i. 


where  X.  ,..X.  are  the  eigenvalues  outside  E(d,e,a)  and  <  is  a  small  term  in  comparison  with  the  first  r 
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ones.  All  we  need,  therefore,  b  some  powerful  method  to  extract  those  eigenvalues  from  the  single 
vector  <n.  We  will  refer  to  snch  a  method  as  a  purification  process.  One  such  process  among  many 
others  b  the  Arnold!  method  considered  in  the  next  section. 

3.  Arnoldi’a  method  ns  a  purification  process 

A  brief  description  of  Arnold! 's  method  b  the  following*. 

Arnoldi’s  Algorithm 

1.  Start:  Choose  an  initial  vector  v}  of  norm  unity,  and  a  number  of  steps  m. 

2.  Iterate:  For  js*l,2,..  m  do: 


Vi-*V£V. 

(15) 

with  hjj— (Avj.Vj),  i— l,..j 

(16) 

V.J-H  Vi  II 

(17) 

Vi-WVu 

(18) 

This  algorithm  produces  an  orthonormal  basb  Vm«»[vj,v2,..vn)]  of  the  Krylov  subsapee 
Kin*span{Vj,AVj,...Am'1Vj}.  In  this  basb  the  restriction  of  A  to  Km  b  represented  by  the  upper 
Hessenberg  matrix  Hm  whose  entries  are  the  elements  b-  produced  by  the  algorithm.  The  eigenvalues 
of  A  are  approximated  by  those  of  Hm  which  b  such  that  HB*V^AVm.  The  associated  approximate 
eigenvectors  are  given  by: 

5i  *  Vm  h  (19) 

where  y-(,  b  an  eigenvector  of  Hm  associated  with  the  eigenvalue  A..  We  will  assume  throughout  that 
the  pair  of  eigenvectors  associated  with  a  conjugate  pair  of  eigenvalues  are  normalised  so  that  they 
are  conjugate  to  each  other.  Note  that  3-  has  the  same  Euclidean  norm  as  f ..  The  following  relation 
is  extremely  useful  for  obtaining  the  residual  norm  of  5-  without  even  computing  it  explicitly: 

II  (A  -  Xjl)  5,11  -  b„*1,„|#,| 


(20) 


in  which  em-»(0,0,...0,l)T.  This  result  is  well  known  in  the  ease  of  the  symmetric  Lanczos 
algorithm  (25],  and  its  extension  to  the  nonsym metric  case  is  straightforward  [30]. 

The  method  of  Arnoldi  amounts  to  a  Galerkin  process  onto  the  Krylov  subspace  Kffi  [1,  30].  A  few 
variations  on  the  above  basic  algorithm  have  been  proposed  in  [30]  in  order  to  overcome  some  of  its 
impractical  features,  and  a  theoretical  analysis  was  presented  in  [31]. 

One  important  property  of  the  algorithm  is  that  if  the  initial  vector  Vj  is  exactly  in  an  invariant 

subspace  of  dimension  r  and  not  in  any  invariant  subspace  of  smaller  dimension,  i.e.  if  the  degree  of 

the  minimal  polynomial  of  Vj  is  r,  then  the  above  algorithm  cannot  be  continued  after  step  r,  because 

we  will  obtain  ||vr+1H»l“0.  However,  the  next  proposition  shows  that  in  this  case  Kr  will  be  invariant 

which  implies,  in  particular,  that  the  r  computed  eigenvalues  are  exact. 

Proposition  2:  Assume  that  the  degree  of  the  minimal  polynomial  of  v{  is  equal  to 
r.  Then  Arnoldi '»  method  stops  at  step  r  and  Kr  is  an  invariant  subspaee.  Furthermore, 
the  eigenvalues  of  Hr  are  the  eigenvalues  of  A  associated  with  the  invariant  subspace  Kf. 

Proof:  Without  loss  of  generality  we  will  assume  that  the  eigenvalues  associated  with  the 

invaraint  subspace  are  X-i,i»*l,r.  We  will  use  the  following  result  proved  in  [31]:  at  every  step  k  of 

Amoidi's  method,  the  characteristic  polynomial  of  Hfc  is  the  polynomial  which  minimizes  the  norm 

||p( AlVjD  over  all  monie  polynomials  of  degree  k.  By  assumption  there  is  a  monic  polynomial  p  of 

degree  r  such  that  p(A)Vj-»0,  namely  the  minimal  polynomial  of  vr  Therefore  the  characteristic 

polynomial  of  Hf  is  precisely  p.  Furthermore,  comparing  the  recurrence  relation  satisfied  by  the 

characteristic  polynomial  of  H^,  and  that  of  the  sequence  vk,  it  is  clear  that  we  have: 

vf+1«»a.p(A)vI  where  a  is  some  scalar.  Hence  vr+j™0.  Finally,  as  a  result  of  this  and  from  the 

relations  Av.-»E  h-.v.,  j— l,..r  we  clearly  have: 

J  ,.i  u  1 

AV-VHr 

which  means  that  Kr  is  invariant  and  completes  the  proofs 

4.  The  Arnold!- Chebyahev  method 

In  order  to  exploit  the  above  proposition,  notice  that  the  vector  zD  in  (14)  produced  after  n  steps  of 
Chebysbev  with  the  appropritate  parameters  c,d,  will  almost  be  in  the  invariant  subspace  spanned  by 
the  eigenvectors  u-  ,u-  ,...u. .  Suppose  that  we  can  find  an  ellipse  E(d,c,a)  that  contains  all  the 

*1  *»  ‘r 

eigenvalues  of  A  except  the  r  wanted  ones,  i.e.  the  r  eigenvalues  of  A  with  largest  real  parts.  We  will 
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describe  in  a  moment  an  adaptive  way  of  getting  such  an  ellipse.  Then  an  appealing  algorithm  would 
be  to  run  a  certain  number  of  steps  of  the  Chebyshev  iteration  and  take  the  resulting  vector  zn  as 
initial  vector  in  the  Arnoldi  process.  From  the  Arnoldi  purification  one  obtains  a  set  of  m  eigenvalues 
r  of  which  are  approximations  to  the  r  wanted  ones,  while  the  remaining  ones  will  be  helpful  for 
adaptively  constructing  the  best  ellipse.  After  a  cycle  consisting  of  n  steps  of  Chebyshev  iteration 
followed  by  m  steps  of  the  purification  process,  the  accuracy  realised  for  the  r  rightmost  eigenpairs 
may  not  be  sufficient  and  restarting  will  then  be  necessary.  The  following  is  an  outline  of  a  simple 
algorithm  based  in  the  above  ideas: 

•  Start:  Choose  an  initial  vector  v]t  a  number  of  Arnoldi  steps  m  and  a  number  of 
Chebyshev  steps  n. 

•  Iterate: 

1.  Perform  m  steps  of  the  Arnoldi  algorithm  starting  with  Vj.  Compute  the  m 
eigenvalues  of  the  produced  Hessenberg  matrix.  Select  the  r  eigenvalues  of  largest 
real  parts  Xj,..Xf  and  take  R“{Xf+1,...Xm}.  If  satisfied  stop,  otherwise  continue. 

2.  Using  R,  obtain  the  new  estimates  of  the  parameters  d  and  c  of  the  best  ellipse. 

Then  compute  the  initial  vector  tg  for  the  Chebyshev  iteration  as  a  linear 
combination  of  the  eigenvectors  u.,  i«— »l,.jr. 

3.  Perform  n  steps  of  Chebyshev  iteration  to  obtain  t#.  Take  Vj—sn/||zB||  and  go 
back  to  1. 

Next,  some  important  details  left  unelear  in  the  above  simplistic  description  will  be  examined. 

4.1.  Getting  the  optimal  ellipse 

As  explained  earlier  we  would  like  to  find  the  ‘best'  ellipse  enclosing  the  set  R  of  remaining 
eigenvalues,  i.e.  the  eigenvalues  other  than  the  ones  with  the  r  algebraically  largest  real  parts.  We 
must  begin  by  clarifying  what  is  meant  by  "best"  in  the  present  context.  Consider  Figure 
4-1  representing  a  spectrum  of  some  matrix  A  and  suppose  that  we  are  interested  to  the  4  rightmost 
eigenvalues,  i.e.  r*M. 

la  the  context  of  linear  systems,  there  is  only  one  convergence  ratio  [21,  23],  and  the  best  ellipse  is 
defined  as  being  the  one  which  maximises  that  ratio.  In  our  situation  we  have  r  eigenvalues  and 


Figure  4*1:  Example  of  a  spectrum 

therefore  r  different  convergence  ratios  each  given  by  (13).  Recall  that  a^  is  the  major  semi  axis  of 
the  ellipse  with  center  d  and  focal  distance  c,  passing  through  Xj  and  that  the  unknowns  are  d  and  c. 

Initially,  assume  that  Xf  is  real.  It  is  easily  seen  from  our  comments  of  Section  2.2  that  if  we  draw 
a  vertical  line  passing  through  the  eigenvalue  Xf,  all  eigenvalues  on  the  right  of  that  line  will 
converge  faster  than  those  on  the  left.  This  will  be  true  for  any  ellipse  containing  R.  Therefore,  when 
Xf  is  real,  we  may  simply  define  the  best  ellipse  as  the  one  maximizing  the  convergence  ratio  icf  of  Xf. 

When  Xr  is  not  real,  the  situation  is  more  complicated.  We  could  still  attempt  to  maximise  the 
convergence  ratio  for  the  eigenvalue  Xr,  but  the  formulas  giving  the  optimal  ellipse  do  not  extend  to 
the  ease  where  Xf  is  complex  and  the  best  ellipse  becomes  difficult  to  determine.  But  this  is  not  the 
main  reason  why  this  choice  is  not  suitable.  A  close  look  at  Figure  4*2,  in  which  we  assume  r*->5, 
reveals  that  the  best  ellipse  for  Xf  may  not  be  a  good  ellipse  for  some  of  the  desired  eigenvalues. 
More  precisely,  in  the  figure,  the  pair  X,,  X}  is  enclosed  by  the  best  ellipse  for  Xj.  As  a  consequence 
the  components  in  Ug,u3  will  converge  more  slowly  than  those  in  some  of  the  undesired  eigenvectors, 


a 
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e.g.  XN  in  the  figure. 
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Figure  4-2:  Case  where  p  ■■  X  (complex):  the 
eigenvalues  X2  and  X3  are  inside  tne  “bat”  ellipse. 

The  figure  explains  the  difficulty  more  dearly:  the  problem  comes  from  the  relative  position  of  X 
and  X2  with  respect  to  the  rest  of  the  spectrum  and  it  can  be  resolved  by  just  maximising  the 
convergence  ratio  of  X2  instead  of  X&  in  this  case. 

In  a  more  complex  situation  it  is  unfortunately  more  difficult  to  determine  at  which  particular 
eigenvalue  Xk  or  more  generally  at  which  value  p  it  is  best  to  maximise  <c(/t).  Clearly,  one  could 
solve  the  problem  by  taking  p—Jie(Xr),  but  this  is  not  the  best  choice. 

As  an  alternative,  we  propose  to  take  advantage  of  the  previous  dlipse,  i.e.  the  ellipse  determined 
from  the  previous  purification  step,  as  follows.  We  determine  a  point  p  on  the  real  line  having  the 
tame  convergence  ratio  at  Xf  with  reaped  to  the  previout  ellipte.  The  next  best  ellipse  is  then 
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determined  so  as  to  maximite  the  convergence  ratio  for  that  value  p.  This  reduces  to  the  previous 
choice  p~Ae(Xr)  when  Xf  is  real.  At  the  very  first  iteration  one  can  set  p  to  be  i?e(Xf). 

The  question  which  we  have  not  yet  fully  answered  concerns  the  practical  determination  of  the 
best  ellipse.  At  a  typical  step  of  the  Arnoldi  process  we  are  given  m  approximations  X;,  i— l,..m  of 
the  eigenvalues  of  A.  This  approximate  spectrum  is  divided  in  two  parts:  the  r  wanted  eigenvalues 
Xr..Xr  and  the  set  R  of  the  remaining  eigenvalues  R— {Xr+1,Xf+2,..Xm}.  From  the  previous  ellipse 
and  the  previous  sets  R,  we  would  like  to  determine  the  next  estimates  for  the  optimal  parameters  d 
and  c. 

Fortunately,  a  similar  problem  was  solved  by  Manteuffei  [21,  23]  and  his  work  can  easily  be 
adapted  to  our  situation.  The  change  of  variables  £-«(/i  -  X)  transforms  p  into  the  origin  in  the  £ 
plane  and  the  problem  of  maximizing  the  convergence  for  p  is  transformed  into  one  of  maximizing  a 
similar  ratio  in  the  £  plane  for  the  origin.  An  effective  technique  for  solving  this  final  problem  has 
been  developed  in  [21,  23]  but  we  will  not  describe  it  here.  Thus,  all  we  have  to  do  is  pass  the  shifted 
eigenvalues  p-\-3  to  the  appropriate  codes  in  [21],  and  the  optimal  values  of  p-d  and  c  will  be 
returned. 

4.2.  Starting  the  Chebynhew  iteration 

Once  the  optimal  parameters  d  and  c  have  been  estimated  we  are  ready  to  carry  out  a  certain 
number  n  of  steps  of  the  Chebvshev  iteration  (10).  In  this  subsection  we  would  like  to  indicate  how 
to  select  the  starting  vector  zQ  for  this  iteration.  Before  doing  so,  we  wish  to  deal  with  a  minor 
difficulty  encountered  when  X,  is  complex.  Indeed,  it  was  mentioned  after  the  algorithm  described  in 
Section  2.1  that  the  eigenvalue  X,  in  (7)  should,  in  practice,  be  replaced  by  some  approximation  v  of 
X,.  If  X,  is  real  then  v  can  be  taken  to  be  i*—X ,  and  the  iteration  can  be  carried  out  in  real 
arithmetic  as  was  already  shown,  even  when  c  is  purely  maginary.  However,  the  iteration  will  become 
complex  if  Xj  is  complex.  To  avoid  this  it  suffices  to  take  v  to  be  the  point  on  the  real  axis  where  the 
ellipse  E(d,c,a,)  passing  through  X},  crosses  the  real  axis.  The  effect  of  the  corresponding  scaling  of 
the  Chebyshev  polynomial  will  be  identical  with  that  nsing  Xj  but  will  present  the  advantage  of 
avoiding  complex  arithmetic. 

Let  us  now  indicate  how  one  can  select  the. initial  vector  zQ.  In  the  hybrid  algorithm  outlined  in 


the  previous  section,  the  Chebyshev  iteration  comes  after  an  Arnold!  step.  It  u  then  desirable  to  start 
the  Chebyshev  iteration  by  a  vector  which  is  a  linear  combination  of  the  approximate  eigenvector* 
(19)  associated  with  the  rightmost  r  eigenvalues. 

Let  (■  be  the  scalars  of  the  desired  linear  combination.  Then  the  initial  vector  for  the  Chebyshev 
process  is 

H-5  «-£  -v.|  tfc 

Hence 

*0  -  vm  y  • whcre  y  -  5  $  f>-  <21) 

Therefore,  the  eigenvectors  Uj,  i“l  j  must  not  be  computed  explicitly.  We  only  need  to  compute  the 
eigenvectors  of  the  Hessenberg  matrix  Hm  and  to  select  the  appropriate  coefficents  An  important 
remark  is  that  if  we  choose  the  £’s  to  be  real  and  such  that  for  all  conjugate  pairs  X;, 

Xi+1»X..  then  the  above  vector  z#  is  real. 

Assume  that  all  eigenvectors,  exact  and  approximate,  are  normalized  so  that  their  2-norms  are 
equal  to  one.  One  desirable  objective  when  choosing  the  above  linear  combination  is  to  attempt  to 
make  zB,  the  vector  which  starts  the  next  Arnoidi  step,  equal  to  the  sum  of  the  eigenvectors  u-, 
i—l,..r.  For  this  purpose  suppose  that  for  each  approximate  eigenvector  Uj  we  have  Uj—Tf.Uj  +  «., 
where  the  vector  has  no  components  in  Uj,..ur.  Then: 

ln“  *lV»l  +  f272u2+..fT1rUr  +  < 

where 

<  -  E  Vi 

►i  1 1 

Near  convergence  Itt-J  is  close  to  one  and  ||«.||  is  small.  The  result  of  n^  steps  of  the  Chebyshev 
iteration  applied  to  z0  will  be  a  vector  zB  of  the  form: 

‘a  m  *iVl  +  *  2*2  "h  “s  +"  K~?  7r  ur  +  pD(A)< 

Since  c  has  no  components  in  u-,  i"l,.j,  p0(A)<  tends  to  zero  faster  than  the  first  r  terms,  as  n 
tends  to  infinity.  Henee,  taking  fj— itPwill  give  a  vector  whieh  has  components  y.  in  the  eigenvectors 
Up  Since  near  convergence  this  is  a  satisfactory  choice. 
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Another  possibility  suggested  in  [30]  for  the  iterative  Arnoldi  process  is  to  weight  the  combination 
of  5;  according  to  the  accuracy  obtained  after  an  Arnold!  step,  for  example: 

(r  II  (a-V)  5  H 

Notice  that  the  residuals  of  two  complex  conjugate  approximate  eigenelemnts  are  equal,  so  this 
choice  will  also  lead  to  a  real  Sq.  The  purpose  of  weighting  a  vector  u-  by  its  residual  norm  is  to 
attempt  to  balance  the  accuracy  between  the  different  eigenelemetns  that  would  be  obtained  at  the 
next  Arnoldi  step.  Thus  if  too  much  accuracy  is  obtained  for  u{  versus  the  other  approximate 
eigenvectors,  the  above  choice  of  the  (/s  will  put  less  weight  on  u,  and  more  on  the  other  vectors  in 
order  to  attempt  to  reduce  the  advantage  of  u{  in  the  next  Arnoldi  step. 

In  the  experiments  reported  later,  we  have  only  eonsiderd  the  first  possibility. 

4.3.  Choosing  the  parameters  m  and  n 

The  number  of  .Arnoldi  steps  m  and  the  number  of  Chebyshev  steps  n  are  important  parameters 
that  alTect  the  effectiveness  of  the  method.  Since  we  want  to  obtain  more  eigenvalues  than  the  r 
desired  ones,  in  order  to  nse  diem  for  acquiring  the  parameters  of  the  ellipse,  m  should  be  at  least 
r+2  (to  be  able  to  compute  a  complex  pair).  In  practice,  however,  it  is  preferable  to  take  m  several 
times  larger  than  r.  In  typical  runs  m  is  at  least  3r  or  4r  but  can  very  well  be  much  larger  if  storage 
permits  it.  It  is  also  possible  to  change  m  dynamically  instead  of  keeping  it  fixed  to  a  certain  value 
but  this  will  not  be  considered  here. 

When  chosing  n,  we  have  to  take  into  account  the  following  facts: 

•  Taking  n  too  small  may  result  in  a  slowing  down  of  the  algorithm;  ultimately  when  n*<*0, 
the  method  becomes  the  simple  iterative  Arnoldi  method. 

•  It  may  not  be  effective  to  piek  n  too  large:  otherwise  the  vector  tn  may  become  nealry  an 
eigenvector  which  could  be  troublesome  for  the  Arnoldi  process. 

Recalling  that  the  component  in  the  direction  of  u(  will  remain  constant  while  those  in  Uj,i-»2,..r, 
will  be  of  the  same  order  as  «c.'n  we  should  attempt  to  avoid  having  a  vector  tB  which  is  entirely  in 
the  direction  of  u,.  This  can  be  done  by  requiring  that  all  i»2,...r  be  no  less  than  a  cetain 
tolerance  r,  i.e.: 


(22) 


n  lof(  r )/  log  (  k.  ] 

where  k.  is  the  largest  convergence  ratio  among  i"2,..r. 

Other  practical  factors  should  also  enter  into  consideration.  For  example,  it  is  desirable  that  a 
maximum  number  of  Chebyshev  steps  nB&s  be  fixed  by  the  user.  Also  in  ease  we  are  close  to 
convergence,  we  should  avoid  to  perform  an  unnecessaryly  large  number  of  steps  as  might  be  dictated 
by  a  straightforward  application  of  (22). 

6.  Application  to  the  subspaee  iteration  algorithm 

5.1.  The  basic  subspace  iteration  algorithm 
The  subspace  iteration  method,  or  simultaneous  iteration  method,  can  be  regarded  as  a  (Galerkin) 
projection  method  onto  a  subspace  of  the  form  AnX,  where  Xs«[xj,..xm]  is  an  intiai  system  of  m 
linearly  independent  vectors.  There  are  many  versions  of  the  method  [4,  13,  30,  37],  but  a  very 
simple  one  is  the  following: 

1.  Start:  Q  «■  X 

2.  Iteration:  Compute  Q  «■  A"Q 

3.  Projection  step:  Orthonorm  aliie  Q  and  get  eigenvalues  and  eigenvectors  of  C»QTAQ. 
Compute  Q«a  QF,  where  F  is  the  matrix  of  eigenvectors  of  C. 

4.  Convergence  teat:  If  Q  is  not  a  satisfactory  set  of  approximate  eigenvectors  go  to  2. 


The  algorithm  presented  in  [13]  is  equivalent  to  the  above  algorithm  except  that  the  approximate 
eigenelements  are  computed  without  having  to  ortho  normalise  Q.  The  SRRIT  algorithm  presented  by 
Stewart  [30,  37]  aims  at  computing  an  orthonormal  basis  Q  of  the  invariant  subspaces  rather  than  a 
basis  formed  of  eigenvectors.  It  is  also  mathemaieally  equivalent  to  the  above  in  the  restricted  sense 
that  the  corresponding  invariant  subspaces  are  theoretically  identical.  We  should  point  out  that  this 
final  approach  is  more  robust  because  an  eigen  basis  of  the  invariant  subspace  may  mi  exist  or  may 
be  badly  conditioned  which  will  lead  to  serious  difficulties  for  the  other  versions.  We  should  stress 
however  that  the  Chebyshev  acceleration  technique  can  be  applied  to  any  version  of  the  avbapoec 


iteration  although  it  will  only  be  described  for  the  simpler  version  presented  above. 

5.2.  Chebyshev  acceleration 

The  use  of  Chebyshev  polynomials  for  accelerating  the  sabspaee  iteration  was  suggested  by 
Rntishanser  [29,  40]  for  the  symmetric  case.  It  was  pointed  out  in  [31]  that  this  powerful  technique 
can  be  extended  to  the  nonsymmetric  case  but  no  explicit  algorithm  was  formulated  for  computing 
the  best  ellipse. 

We  will  use  the  same  notation  as  in  the  previous  sections.  Suppose  that  we  are  interested  in  the 
rightmost  r  eigenvalues  and  that  the  ellipse  E(d,e,a)  contains  the  set  R  of  all  the  remaining 
eigenvalues.  Then  the  principle  of  the  Chebyshev  acceleration  method  is  simply  to  replace  the  powers 
A"  in  the  first  part  of  the  basic  algorithm  described  above  by  pn(A)  where  pQ  is  the  polynomial 
defined  by  (6).  It  can  be  shown  [31]  that  the  approximate  eigenvector  5;,  i«l..j  converges  towards 
Uj,  as  Tn(a/c)/Tn[(Xj— <l)/c],  which,  using  arguments  similar  to  those  of  Section  2.2,  is  equivalent  to  17" 
with 


a  +  [a2-  l]1/2 


(23) 


The  above  convergence  ratio  can  be  far  better  than  the  l\.+iA;l  which  is  achieved  by  the  classical 
algorithm1. 


On  the  practical  side,  the  best  ellipse  is  obtained  dynamically  in  the  same  way  as  was  proposed  for 
the  Chebyshev-Arnoldi  process.  The  accelerated  algorithm  will  then  have  the  following  structure: 

1.  Start:  Q  ^  X 

2.  Iteration:  Compute  Q  *■  pn(A)Q 

3.  Projection  ttep:  Orthonorm alite  Q  and  get  eigenvalues  and  eigenvectors  of  C*««QTAQ. 
Compute  Q  <•  QF,  where  F  is  the  matrix  of  eigenvectors  of  C. 


'The  subspace  iteration  method  compute*  the  eigenvalue*  of  largest  modulil  Therefore,  the  regular  subtpact  iteration 
method  and  the  accelerated  method  are  comparable  only  when  the  r+1  rihgtmost  eigenvalue*  are  also  the  r+1  dominant 


4.  Convergence  test:  If  Q  is  a  satisfactory  set  of  approximate  egenveetors  then  stop,  else  get 
new  best  ellipse  and  go  to  2. 

Most  of  the  practicalities  described  for  the  Arnold!  process  extend  naturally  to  this  algorithm  and 
we  now  discuss  briefly  a  few  of  them. 

1.  Getting  the  beet  ellipse.  The  construction  of  the  best  ellipse  is  identical  with  that  seen  in 
Section  4.1.  The  only  difficulty  we  might  encounter  is  that  the  extra  eigenvalues  used  to  build  the 
best  ellipse  are  now  leas  accurate  in  general  than  those  provided  by  the  more  powerful  Amoidi 
technique.  More  care  must  therefore  be  taken  in  order  to  avoid  building  an  ellipse  based  on  inacurate 
eigenvalues  as  this  may  slow  down  considerably  the  algorithm. 

2.  Parameters  n  and  m.  Here,  one  can  take  advantage  of  the  abundunt  work  on  subspace 
iteration  available  in  the  literature.  All  we  have  to  do  is  replace  the  convergence  ratios  |Xj/Xr+1|  of 
the  basic  subspace  iteration  by  the  new  ratios  tf.  of  (23).  For  example,  one  way  to  determine  the 
number  of  Chebyshev  steps  n,  proposed  in  {29}  and  in  {14}  is: 

n**  y[l  +  log  (c'tylogfa,  )J 

where  c  is  some  para  mater  depending  on  the  unit  round  off.  The  reason  for  this  choice  is  to  attempt 
to  avoid  the  rounding  errors  to  be  grow  beyond  the  level  of  the  error  in  the  most  slowly  converging 
eigenvector.  The  parameter  n  is  also  limited  from  above  by  a  user  supplied  bound  nmu,  and  by  the 
fact  that  if  we  are  close  to  convergence  a  smaller  n  can  be  determined  to  ensure  convergence  at  the 
next  projection  step. 

The  same  comments  as  in  the  Arnoldj-Chebyshev  method  can  be  made  concerning  the  choice  of  m, 
namely  that  m  should  be  at  least  r+2,  but  preferably  even  larger  although  in  a  leaser  extent  than  for 
Arnokli.  Note  that  for  the  symmetric  case  it  is  often  suggested  to  take  m»2r  or  m-«3r. 

S.  Deflation.  Another  special  feature  of  the  subspaee  iteration  is  the  deflation  technique  which 
consists  in  working  only  with  the  non  converged  eigenvectors,  thus  'locking'  those  that  have  already 
converged,  see  [14,  29,  30].  Clearly,  this  can  be  applied  to  the  accelerated  subspaee  iteration  as  well 
and  will  enhance  its  efficiency. 
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6.  Numerical  experiments 

The  numerical  experiments  described  in  this  section  have  been  performed  on  a  VAXl  1-780 
computer  using  double  precision  (unit  round-off  <«6.9xl0~18). 

0.1.  An  example  of  Markor  Chain  modelling 

An  interesting  class  of  test  examples  described  by  Stewart  [37]  deals  with  the  computation  of  the 
steady  state  probabilities  of  a  Markov  chain.  This  example  models  a  random  walk  on  a  (k+1)  by 
(k+1)  triangular  grid. 


j=6  * 
j=5  * 
j=4  * 

j=3  * 
j=2  * 
j=l  ♦ 
j=0  v 


* 

*  * 

•  *  * 

*  *  *  * 

*  *  *  *  * 


i=0  i=l  i =2  i=3  i=4  i=5  i=6 


Figure  8-l»  Random  walk  on  a  triangular  grid 


A  particuie  moves  randomly  on  the  grid  by  jumping  to  one  of  the  (at  most)  four  adjacent  grid 
points,  see  figure.  The  probability  of  jumping  from  the  node  (ij)  to  either  of  the  nodes  (i-1  j)  or 
(ij-1)  is  given  by: 


Pd(id)- 


lli 

2k 


this  probability  being  doubled  if  either  of  i  or  j  is  zero.  The  probability  of  jumping  from  the  node 
(ij)  to  either  of  the  nodes  (i+ 1  j)  or  (ij-1)  is  given  by 


P«(»  j)  —  pd(U) 
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stopping  criterion  which,  in  [37],  deals  with  the  two  dominant  eigenvalues  1  and  -1  (-1  is  also  known 
to  be  an  eigenvalue).  An  important  observation  is  that  for  the  same  block  site  m,  the  performance  is 
better  with  the  larger  n _ “50  than  with  n _ .“20. 

9  mix  mu 

The  next  table  shows  the  same  example  treated  by  the  Chebyshev  accelerated  snbspaee  iteration. 


Table  2:  Chebyshev  -  Subspace  Iteration 


■ 

rnras 

Iterstions 

Mstris  -  Vector 
Multiplications 

Esecution 
tines  (Sec.) 

Residual 

norms 

6 

50 

25 

1019 

60.2 

1.2  E-07 

6 

20 

30 

645 

41.1 

4.6  E-06 

8 

20 

42 

903 

59.7 

4.9  E-06 

8 

50 

28 

1063 

66.0 

3.8  E-09 

10 

20 

45 

909 

64.3 

1.0  E-06 

10 

50 

27 

979 

62.3 

1.9  E-07 

The  stopping  criterion  and  the  initial  set  X  were  the  same  as  for  the  previous  test.  Notice  that 
here  the  effect  of  the  upper  limit  nmu  of  the  number  of  Chebyshev  iterations  can  be  quite  important, 
as  for  example  when  m“6.  As  opposed  to  the  observation  made  above  for  the  non  acelerated 
algorithm,  the  performance  is  now  better  for  smaller  values  of  the  parameter  n  .  The  reason  for 
this  is  provided  by  a  close  examination  of  the  successive  ellipses  that  are  adaptively  computed  by  the 
process.  It  is  possible  to  observe  that  when  the  ellipse  does  not  accurately  represent  the  convex  hull 
of  the  remaining  eigenvalues,  a  larger  nmu  leads  to  wasting  an  important  amount  of  computational 
work  before  having  the  chance  of  evaluating  new  parameters.  Thus,  for  smaller  values  of  nmu,  the 
process  has  a  better  ability  to  correct  itself  by  computing  a  better  ellipse  more  frequently.  This  is 
less  critical  with  the  Amoldi  process  because  the  eigenvalues  provided  by  Arnoldi's  method  are 
usually  more  accurate. 

It  is  instructive  to  compare  the  above  performances  with  those  of  the  iterative  Amoldi  method  and 
of  Chebyshev  Amoldi  method.  The  next  two  tables  summarise  the  results  obtained  with  the 
iterative  Amoldi  method  (Table  3)  and  the  Arnoldi-Chebyshev  method  (Table  4).  The  stopping 
criterion  is  the  same  as  before,  and  the  initial  vector  used  in  the  first  Amoldi  iteration  is  random. 


24 


Table  3:  Iterative  Arnold! 


Arnoldi 


Matrix  -  Vector 
Multi  pi i cations 

Execution 
tisas  (Sac.) 

Rasidual 

norms 

7.5  E-06 

9.3  E-06 

7.3  E-06 
6.2  E-06 


Table  4:  Chebyshev  -  Arnold! 


Arnotdi  Matrix  -  Vactor  Execution 

calls  Multiplications  tisas  (Sac.) 


Residual 


8.9  E-06 

3.9  E-07 
5.0  E-06 

7.1  E-06 
6.8  E-06 

3.2  E-10 
1.5  E-07 
*.3  E-09 


The  results  of  Table  S  constitute  a  considerable  improvement  over  those  of  the  subspaee  iteration, 
both  in  execution  time  and  in  number  of  matrix  by  vector  multiplications.  Notice  that  in  this 
example  we  are  also  able  to  reduce  the  execution  time  by  a  factor  of  nearly  2.5  from  the  iterative 
Arnoldi  method. 
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6.2.  Computing  several  eigenvalues 

The  above  experiments  deal  with  the  computation  of  only  one  eigenpair  and  we  would  like  next  to 
compare  the  performances  of  oar  methods  on  problems  dealing  with  several  eigenvalues.  Consider 
the  following  partial  differential  linear  operator  on  the  unit  square,  with  the  Dirichlet  boundary 
conditions,  derived  from  [7]: 


d  dn  d  Bn  dgu  &a 

^  dx  dy  dy  dx  dx 

The  functions  a,  b,  and  g  are  defined  by: 

a(x,y)  —  «"**  ;  b (x,y)  — e*7 

g(x,y)  —  T(x+y)  ;  flx.y)  —  j^.y 


(24) 


Discretizing  the  operator  (24)  by  centered  differences  with  mesh  size  h— »l/(p+l)  gives  rise  to  a 
nonsymmetric  matrix  A  of  size  N»*n2.  The  parameter  7  is  useful  for  varying  the  degree  of  symmetry 
of  A. 

Taking  p— *30  and  t*»20,  yields  a  matrix  of  dimension  900  which  is  not  nearly  symmetric.  We 
computed  the  4  rightmost  eigenvalues  of  A  by  the  Araoldi-Chebyshev  algorithm  using  m»15  and 
n  ..■»  80  and  obtained  the  following  results: 

nix 


A.  „  »  9.4429  ±  1.7290  ;  Residual  norm:  5.4  B-13 
A3  4  »  8.9501  dk  1.3381  ;  Residual  norm:  8.4  E-08 

Total  number  of  matrix  by  vector  multiplications  required:  110 
CPU  time:  20.0  see. 

The  initial  vector  was  a  random  vector,  and  the  stoping  criterion  was  that  the  residual  norm  be 
less  than  <a»10‘*.  Details  of  the  execution  are  as  follows:  first  an  Arnoidi  iteration  (15  steps)  was 
performed  and  computed  the  parameters  d—3.803  ,  c2— 14.30.  Then  80  steps  of  the  Chebyshev 
iterations  were  carried  out  and  finally  another  Arnoidi  purification  step  was  taken  and  the  stopping 
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criterion  was  satisfied.  Note  that  we  could  have  reduced  the  amount  of  work  by  using  a  smaller  n  in 
the  Chebyshev  iteration. 

The  same  eigenvalues  were  computed  by  the  subspaee  iteration  method  using  the  same  stopping 
criterion  and  the  parameters  m «■$  and  nmM*50.  The  results  were  delivered  after  220  iterations 
which  consumed  1708  matrix  by  vector  multiplications  and  220  CPU  seconds. 

A  similar  run  with  the  accelerated  subspace  iteration,  with  nmax«15,  took  104  iterations 
corresponding  to  a  total  of  028  matrix  by  veetor  multiplications  and  188  seconds  of  CPU  time. 
Observe  that  the  gain  in  execution  time  does  not  reflect  the  gain  in  the  number  of  matrix  by  vector 
multiplications  because  the  overhead  in  the  accelerated  subspace  iteration  is  substantial. 

We  omitted  to  discuss  in  detail  the  use  of  our  accelerated  algorithms  for  the  computation  of  the 
eigenvalues  with  algebraically  smallest  real  parts,  but  the  development  is  identical  to  that  for  the 
eigenvalues  with  largest  real  pans.  It  suffices  to  relabel  the  eigenvalues  in  increasing  order  of  their 
real  parts  (instead  of  decreasing  order).  In  the  following  test  we  have  computed  the  four  eigenvalues 
of  smallest  real  parts  of  the  matrix  A  defined  above.  Convergence  has  been  more  difficult  to  achieve 
than  in  the  previous  test.  With  m**20,  nmM— 250,  the  Araoldi- Chebyshev  code  satisfied  the 
convergence  criterion  with  <«»10*4,  after  3  calls  to  Araoldi  and  a  total  of  527  matrix  by  veetor 
multiplications.  The  execution  time  was  106  sec.  In  order  to  obtain  the  smallest  eigenvalues  with  the 
regular  Chebyshev  iteration,  we  had  to  shift  A  by  a  certain  scalar  so  that  the  eigenvalues  of  smallest 
real  parts  become  dominant.  We  used  the  shift  7.0,  i.e.  the  subspace  iteration  algorithm  was  applied 
to  the  shifted  matrix  A-7.I,  and  the  resulting  eigenvalues  are  shifted  back  by  7.  to  obtain  the 
eigenvalue  of  A.  The  process  with  m«10  and  nfflJI«50  was  quite  slow  since  it  took  a  total  of  3925 
matrix  by  vector  multiplications  and  525  seconds  to  reach  the  same  stopping  criterion  as  above. 

The  accelerated  subspace  iteration  did  not  perform  better,  however,  since  it  required  4010  matrix 
by  veetor  multiplication  to  converge  with  a  total  time  of  730  seconds.  Here  we  used  m«10  and 
nmmx«-25.  The  reason  for  this  misbehavior  was  that  the  algorithm  encountered  serious  difficulties  to 
obtain  a  good  ellipse  as  could  be  observed  from  the  erratic  variation  of  the  parameters  d  and  e.  We 
believe  that  one  important  conclusion  from  this  is  that  the  Chebyshev  snbspace  iteration  can  become 
unreliable  for  some  shapes  of  spectra  or  when  the  eigenvalues  are  clustered  in  an  infavorable  way.  If 
the  spectrum  is  entirely  real  (or  almost  real)  this  misbehavior  is  unlikely  to  happen  in  general. 
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Perhaps,  another  important  remark  raised  by  this  last  experiment  is  that  fitting  a  general  spectrum 
with  an  ellipse  may  not  be  the  best  idea.  If  we  were  allowed  to  use  domains  E  more  general  than 
ellipses,  then  the  problem  of  fitting  the  spectrum  would  have  been  made  easier.  Clearly,  the  resulting 
best  polynomials  pn  would  not  be  Chebyshev  polynomials  but  this  does  not  constitute  a  major 
disadvantage.  Further  investigation  in  this  direction  is  worth  pursuing. 

7.  Conclusion 

The  purpose  of  this  paper  was  to  show  one  way  of  using  Chebyshev  polynomials  for  accelerating 
nonsymmetric  eigenvalue  algorithms.  The  numerical  experiments  have  confirmed  the  expectation 
that  such  an  acceleration  can  be  quite  effective.  To  conclude,  we  would  like  to  point  out  the 
following  facts: 

•  It  is  not  clear  that  representing  general  spectra  by  ellipses  is  the  best  that  can  be  done. 

For  the  solution  of  linear  systems,  general  domains  have  been  considered  by  Smolarski 
and  Saylor  [35]  who  use  orthogonal  polynomials  in  the  complex  plane.  Thus  far,  this  does 
not  seem  to  have  been  extended  to  solving  nonsymmetric  eigenvalue  problems. 

•  Another  way  of  combining  polynomial  iteration  (e.g.  Chebyshev  Iteration)  or,  more 
generally  rational  iteration,  with  Arnoldi’s  method  has  recently  been  proposed  by 
Ruhe  [28].  Briefly  described  the  idea  is  to  carry  out  Arnoldi’s  algorithm  with  the  matrix 
d(A),  o  being  a  suitably  chosen  rational  function.  Then,  an  ingenious  relation  permits  to 
calculate  the  eigenvalues  of  A  from  the  Hessenberg  matrix  built  by  the  Arnoldi  process. 

•  We  have  selected  Arnoldi's  method  as  a  purification  process,  perhaps  unfairly  to  other 
similar  processes  which  may  be  just  as  powerful  as  Arnoldi’s.  One  such  alternative  is  the 
unsymmetric  Lanczos  algorithm  [19,  26,  38].  Another  possibility  which  we  have  omitted 
to  describe  is  a  projection  process  onto  the  latest  m  vectors  of  the  Chehpshev  iteration. 

This  can  be  realised  at  less  cost  than  m  steps  of  Arnoldi’s  method  although  it  is  not 
known  whether  the  overall  resulting  algorithm  is  more  effective. 
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